Recent advances in LVCSR: A benchmark comparison of performances
Authors
Abstract
Large Vocabulary Continuous Speech Recognition (LVCSR), which is characterized by high variability of speech, is the most challenging task in automatic speech recognition (ASR). Believing that evaluating ASR systems on relevant, common speech corpora is one of the key factors that help accelerate research, we present in this paper a benchmark comparison of the performance of current state-of-the-art LVCSR systems across different speech recognition tasks. Furthermore, we objectively identify the best-performing technologies and the best accuracy achieved so far in each task. The benchmarks show that Deep Neural Networks and Convolutional Neural Networks have proven effective on several LVCSR tasks, outperforming the traditional Hidden Markov Models and Gaussian Mixture Models. They also show that, despite satisfactory performance on some LVCSR tasks, large-vocabulary speech recognition is far from solved on others, where more research effort is still needed.
Similar resources
Tonal articulatory feature for Mandarin and its application to conversational LVCSR
This paper presents our recent work on the development of a tonal Articulatory Feature (AF) for Mandarin and its application to conversational LVCSR. Motivated by the theory of Mandarin phonology, eight features for classifying the acoustic units and one feature for classifying the tone are investigated and constructed in the paper, and the AF-based tandem approach is used to improve speech rec...
Analysis and Comparison of Recent MLP Features for LVCSR Systems
MLP-based front-ends have evolved in different ways in recent years beyond the seminal TANDEM-PLP features. This paper aims at providing a fair comparison of these recent advances, including the use of different long/short temporal inputs (PLP, MRASTA, wLP-TRAPS, DCT-TRAPS) and the use of complex architectures (bottleneck, hierarchy, multistream) that go beyond the conventional three layer MLP. F...
N-best: the northern- and southern-dutch benchmark evaluation of speech recognition technology
In this paper, we describe N-best 2008, the first Large Vocabulary Speech Recognition (LVCSR) benchmark evaluation held for the Dutch language. Both the accent as spoken in the Netherlands (Northern-Dutch) and in Belgium (Southern-Dutch or Flemish), will be evaluated. The evaluation tasks are broadcast news (BN) and conversational telephone speech (CTS). The N-best evaluation will take place in...
Object Recognition Using Deep Neural Networks: A Survey
Recognition of objects using Deep Neural Networks is an active area of research, and many breakthroughs have been made in the last few years. The paper attempts to indicate how far this field has progressed. The paper briefly describes the history of research in Neural Networks and describes several of the recent advances in this field. The performances of recently developed Neural Network Algori...
Improving the velocity tracking of cruise control system by using adaptive methods
Accurate controller performance is important in cruise control systems. The controller in such systems should therefore adapt to noise and to probable changes in the system dynamics. In this article, three approaches are applied toward this purpose: MIT, direct estimation and indirect estimation. These approaches are used as controllers to track refe...